Multi-Task and Multi-Modal Learning for RGB Dynamic Gesture Recognition

Authors

Abstract

Gesture recognition is becoming increasingly popular owing to its many applications in human-machine interaction. Existing multi-modal gesture recognition systems take multiple modalities as input to improve accuracy, but such methods require extra modality sensors, which greatly limits their application scenarios. We therefore propose an end-to-end multi-task learning framework for training 2D convolutional neural networks. The framework can use the depth modality to improve accuracy during training, and save costs by using only RGB at inference. Our network is trained to learn a representation for multi-task learning: gesture segmentation and gesture recognition. Depth contains prior information about the location of the gesture, so it can be used as supervision for the segmentation task. A plug-and-play module named Multi-Scale-Decoder (MSD) is designed to realize segmentation; it contains two sub-decoders, attached to a lower stage and a higher stage of the network respectively, which help the network pay attention to key target areas, ignore irrelevant information, and extract discriminative features. Additionally, the MSD module is discarded at inference without hurting performance: only the RGB input is required, with no extra cost. Experimental results on three public datasets show that our proposed method provides superior performance compared with existing frameworks. Moreover, other CNN-based frameworks also obtain excellent improvement from it.
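The training scheme described above can be sketched in a few lines. This is a minimal conceptual sketch, not the paper's implementation: all class and function names (Backbone, SegDecoder, train_step, the weight lam) are illustrative assumptions, and real features would be tensors rather than Python lists. It only illustrates the key asymmetry: the auxiliary segmentation decoders are supervised by a depth-derived mask during training and removed entirely at inference.

```python
# Hedged sketch of multi-task training with a depth-supervised auxiliary
# segmentation head, discarded at inference. All names are hypothetical.

class Backbone:
    """Stand-in for a 2D CNN; returns lower- and higher-stage features."""
    def forward(self, rgb):
        low = [x * 0.5 for x in rgb]    # lower-stage feature map (toy)
        high = [x * 0.25 for x in rgb]  # higher-stage feature map (toy)
        return low, high

class SegDecoder:
    """Stand-in for one MSD sub-decoder predicting a gesture mask."""
    def forward(self, feat):
        return [1.0 if f > 0.1 else 0.0 for f in feat]

def l2_loss(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def train_step(rgb, depth_mask, label, lam=0.5):
    """Joint loss: recognition loss + weighted segmentation loss.

    depth_mask is a gesture-location mask derived from the depth
    modality; it supervises both MSD sub-decoders.
    """
    low, high = Backbone().forward(rgb)
    logits = [sum(high) / len(high)]          # toy classifier head
    cls_loss = l2_loss(logits, label)
    seg_loss = (l2_loss(SegDecoder().forward(low), depth_mask)
                + l2_loss(SegDecoder().forward(high), depth_mask))
    return cls_loss + lam * seg_loss

def infer(rgb):
    """Inference path: RGB only, MSD decoders removed, no extra cost."""
    _, high = Backbone().forward(rgb)
    return sum(high) / len(high)
```

The design point is that the segmentation branch only shapes the shared backbone's features during training; at test time the backbone and classifier run on RGB alone.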


Similar Articles

Challenges in Multi-modal Gesture Recognition

This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the Kinect revolution, when inexpensive infrared cameras providing depth image recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras,...


Multi-modal Unsupervised Feature Learning for RGB-D Scene Labeling

Most of the existing approaches for RGB-D indoor scene labeling employ hand-crafted features for each modality independently and combine them in a heuristic manner. There have been some attempts at directly learning features from raw RGB-D data, but the performance is not satisfactory. In this paper, we adapt the unsupervised feature learning technique for RGB-D labeling as a multi-modality learn...


Multi-Modal Multi-Task Deep Learning for Autonomous Driving

Several deep learning approaches have been applied to the autonomous driving task, many employing end-to-end deep neural networks. Autonomous driving is complex, utilizing multiple behavioral modalities ranging from lane changing to turning and stopping. However, most existing approaches do not factor the different behavioral modalities of the driving task into the training strategy. This pap...


Multi-modal Multi-task Learning for Automatic Dietary Assessment

We investigate the task of automatic dietary assessment: given meal images and descriptions uploaded by real users, our task is to automatically rate the meals and deliver advisory comments for improving users’ diets. To address this practical yet challenging problem, which is multi-modal and multi-task in nature, an end-to-end neural model is proposed. In particular, comprehensive meal represe...


Bayesian Co-Boosting for Multi-modal Gesture Recognition

With the development of data acquisition equipment, more and more modalities become available for gesture recognition. However, there still exist two critical issues for multimodal gesture recognition: how to select discriminative features for recognition and how to fuse features from different modalities. In this paper, we propose a novel Bayesian Co-Boosting framework for multi-modal gesture ...



Journal

Journal title: IEEE Sensors Journal

Year: 2021

ISSN: 1558-1748, 1530-437X

DOI: https://doi.org/10.1109/jsen.2021.3123443